Yun Shen

BadBone: Backdoor Attacks Against Backbone Models in Visual Prompt Learning

May 29, 2026

The Art of (Mis)alignment: How Fine-Tuning Methods Effectively Misalign and Realign LLMs in Post-Training

Apr 09, 2026

When Understanding Becomes a Risk: Authenticity and Safety Risks in the Emerging Image Generation Paradigm

Mar 25, 2026

Understanding LLM Behavior When Encountering User-Supplied Harmful Content in Harmless Tasks

Mar 12, 2026

Benchmark of Benchmarks: Unpacking Influence and Code Repository Quality in LLM Safety Benchmarks

Mar 03, 2026

DSDR: Dual-Scale Diversity Regularization for Exploration in LLM Reasoning

Feb 23, 2026

JADES: A Universal Framework for Jailbreak Assessment via Decompositional Scoring

Aug 28, 2025

The Ripple Effect: On Unforeseen Complications of Backdoor Attacks

May 16, 2025

The Challenge of Identifying the Origin of Black-Box Large Language Models

Mar 06, 2025

Synthetic Artifact Auditing: Tracing LLM-Generated Synthetic Data Usage in Downstream Applications

Feb 02, 2025